[Wait for #2286] [SWAP] Implement inference mode #2300
base: main
Conversation
- Match requestMemory arguments with memory_pool - Added override keyword Signed-off-by: hyeonseok lee <[email protected]>
- To support scaled dot product on the attention layer, as described in the paper "Attention Is All You Need", add a scaled dot product property Signed-off-by: hyeonseok lee <[email protected]>
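For context, scaled dot-product attention as defined in that paper scales the query-key similarity by the key dimension before the softmax; how the new property wires this into the layer is defined by the PR itself:

```math
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
```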
- To provide dynamic input dimensions, implement a reinitialize function - This commit is a PoC of reinitialize, so much of the code is copied from the initialize function and still needs to be refined. Signed-off-by: hyeonseok lee <[email protected]>
- Added causal mask in attention layer - Implements PicoGPT Signed-off-by: hyeonseok lee <[email protected]>
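As a rough illustration of the causal mask idea (names and buffer layout here are assumptions, not the actual attention layer code), future positions are pushed to negative infinity before the softmax so that token i can only attend to tokens 0..i:

```cpp
// Illustrative sketch only: apply a causal mask to a [seq_len x seq_len]
// score matrix stored row-major, masking out future positions.
#include <limits>
#include <vector>

void applyCausalMask(std::vector<float> &scores, size_t seq_len) {
  const float neg_inf = -std::numeric_limits<float>::infinity();
  for (size_t i = 0; i < seq_len; ++i)
    for (size_t j = i + 1; j < seq_len; ++j)
      scores[i * seq_len + j] = neg_inf; // token i cannot see token j > i
}
```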
Implement picoGPT/GPT-2's encoder in C++ using nlohmann/json.hpp, so we need to add an include path to compile the JSON parser. Signed-off-by: Donghak PARK <[email protected]>
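A minimal sketch of what loading the GPT-2 token table with nlohmann/json can look like; the file name and map layout are assumptions and not the code in encoder.hpp:

```cpp
// Minimal sketch, assuming a GPT-2 style "encoder.json" (token -> id map).
#include <fstream>
#include <string>
#include <unordered_map>
#include <nlohmann/json.hpp>

std::unordered_map<std::string, int> loadEncoder(const std::string &path) {
  std::ifstream file(path);
  nlohmann::json j = nlohmann::json::parse(file);
  std::unordered_map<std::string, int> vocab;
  for (auto &[token, id] : j.items())
    vocab.emplace(token, id.get<int>());
  return vocab;
}
```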
Add PicoGPT user input handling and add comments in encoder.hpp Signed-off-by: Donghak PARK <[email protected]>
This PR includes the PicoGPT (https://github.com/jaymody/picoGPT) Android application with NNTrainer. We only use the PicoGPT model binary; the NNTrainer implementation is provided in nnstreamer#2212. This is the Android application implementation for that PR.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
- PoC of incremental inference - Only works if batch and channel size are 1 - For the concat layer, the incremental inference step only works when the concat axis is the width axis Signed-off-by: hyeonseok lee <[email protected]>
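Conceptually, incremental inference means that after the prompt has been processed once, each generation step feeds only the newly produced token positions to the model instead of re-running the full sequence. Below is a hedged sketch of that loop with the model call abstracted behind a callback; it is illustrative and not the NNTrainer API added by this PR:

```cpp
// Sketch of an incremental generation loop: each step runs the model only
// on positions [from, to) and appends the predicted next token.
#include <functional>
#include <vector>

std::vector<int>
generate(std::vector<int> tokens, int total_len,
         const std::function<int(const std::vector<int> &, int /*from*/,
                                 int /*to*/)> &step) {
  for (int pos = static_cast<int>(tokens.size()); pos < total_len; ++pos) {
    int next = step(tokens, pos - 1, pos); // only the newest position
    tokens.push_back(next);
  }
  return tokens;
}
```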
- Each thread copies the data in the batchwise direction Signed-off-by: hyeonseok lee <[email protected]>
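A short sketch of what a batch-wise parallel copy can look like, one thread per batch slice; the buffer layout and names are assumptions, not the NNTrainer tensor code:

```cpp
// Sketch: copy a contiguous [batch x per_batch_size] float buffer, spawning
// one thread per batch slice.
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

void copyBatchwise(const float *src, float *dst, size_t batch,
                   size_t per_batch_size) {
  std::vector<std::thread> workers;
  workers.reserve(batch);
  for (size_t b = 0; b < batch; ++b) {
    const float *s = src + b * per_batch_size;
    float *d = dst + b * per_batch_size;
    workers.emplace_back(
      [s, d, per_batch_size]() { std::copy(s, s + per_batch_size, d); });
  }
  for (auto &t : workers)
    t.join();
}
```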
- Apply incremental inference to PicoGPT Signed-off-by: hyeonseok lee <[email protected]>
This PR includes fixes for running GPT. Signed-off-by: jijoong.moon <[email protected]>
This PR includes some fixes to run PicoGPT with W16A16 on Android using NEON.
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
This PR includes:
- Fixes to enable memory optimization
- Removal of an unnecessary memory buffer
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: jijoong.moon <[email protected]>
Signed-off-by: Jiho Chu <[email protected]>
📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2300. Please follow the 1 commit/1 PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before the review process by reviewers starts. If you are a new member joining this project, please read the manuals in the documentation folder and wiki page. To monitor the progress status of your PR in more detail, visit http://ci.nnstreamer.ai/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202308311949210.47621703147888-70967764de236a28f1c8ab1a1c5d83aaff745c49/.
Signed-off-by: Jiho Chu <[email protected]>
LGTM!
Force-pushed from 7096776 to 73f568b
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309032343410.87508106231689-73f568beb28f0012aacd1d79664903387cce7dc4/.
Please remove generated files, including *.lock.
This patch is for inference mode on the swap device. It re-enables the mmap feature, but the write timing is controlled manually to handle inference mode. Signed-off-by: Jiho Chu <[email protected]>
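Since weights are read-only during inference, the swap path can map them without ever writing back. A minimal POSIX sketch of that idea, assuming a plain weight file; this is illustrative, not the actual swap-device code:

```cpp
// Sketch: map a weight file read-only; no write-back is needed because the
// mapping is never modified during inference.
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

void *mapWeightsReadOnly(const char *path, size_t &size_out) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return nullptr;
  struct stat st;
  if (fstat(fd, &st) != 0) {
    close(fd);
    return nullptr;
  }
  size_out = static_cast<size_t>(st.st_size);
  void *ptr = mmap(nullptr, size_out, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd); // the mapping remains valid after the descriptor is closed
  return ptr == MAP_FAILED ? nullptr : ptr;
}
```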
Force-pushed from 73f568b to 46547d7
We generate a report if there are dangerous coding constructs in your code. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/report/.
INFO: You can check whether there are misspelled words in our misspelling check report. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/report/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309071954050.96274399757385-14e9875d785f395a0d9a2d092d31a6aac38abcc9/.
This patch removes unnecessary files. Signed-off-by: Jiho Chu <[email protected]>
Force-pushed from 14e9875 to b0eed2c
We generate a report if there are dangerous coding constructs in your code. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/report/.
INFO: You can check whether there are misspelled words in our misspelling check report. Please read http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/report/.
cibot: @jihochu, A builder checker could not be completed because one of the checkers is not completed. In order to find out a reason, please go to http://ci.nnstreamer.ai/nntrainer/ci/repo-workers/pr-checker/2300-202309072105160.96397399902344-b0eed2c2a07c36c863b09b4eeb84530b8f0348a0/.
Thanks. The lock files have been removed.
This patch adds inference mode for the swap device.
It adds a "memory_swap_mode" property, which handles inference mode when the swap device is used.
The weights are not modified during inference, so their data does not need to be swapped out to the device.
Signed-off-by: Jiho Chu [email protected]
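A hedged sketch of how the new property might be set through NNTrainer's C++ API; the value "inference" and the accompanying memory_swap=true toggle are assumptions about usage, not confirmed by this PR:

```cpp
// Sketch, assuming the value "inference" for the new memory_swap_mode
// property; consult the PR for the actual accepted values.
#include <memory>
#include <model.h> // NNTrainer ml::train C++ API

int main() {
  auto model = ml::train::createModel(ml::train::ModelType::NEURAL_NET);
  model->setProperty({"memory_swap=true",
                      "memory_swap_mode=inference"}); // property added by this PR
  return 0;
}
```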